i DOCUMENT BUILDER CLASSES AND METHODS 

2 

3 FIELD OF THE INVENTION 
4 

5 This invention relates generally to object oriented computer programming, and more 

6 particularly to software tools for programmatically developing documents. 

7 

8 BACKGROUND OF THE INVENTION 

9 

10 Figure 1 illustrates an exemplary web page 100. The web page 100 is displayed in a 

1 1 browser window 1 10 having a title bar 120. The web page 100 comprises four text sections 

12 132, 134, 136 and 138, in various styles and formats, a graphic image 140, a horizontal line 

13 150 and a table 160 containing several hyperlinks 170. 
14 

15 Figure 2 illustrates an HTML (hypertext mark-up language) document 200 

] 6 corresponding to the web page 100. A browser program generates the web page 100 using 

1 7 the HTML document 200 as input. The basic building blocks of an HTML document are 

1 8 tags. Each tag is sandwiched between angle brackets ("<" and **>"). As an example, the first 

19 four tags in the HTML document 200 are "<HTML>," "<HEAD>/ ? "<TJTLE>" and 

20 "</TITLE>." Tags come in two types: opening tags and closing tags. A Closing tag has a 

21 backslash ("/") after the left angle bracket ("<") and is otherwise the same as its matching 

22 opening tag. In the HTML document 200, the first example of a matched pair of opening and 

23 closing tags are "<TITLE>" and "</TITLE>" on the third line. Between these opening and 

24 closing tags is an argument (in this case, "OO Objects/Classes/Instances"), which shows up in 

25 the title bar 120 of the browser window 110. The HTML document 200 includes many other 

26 examples of matched pairs of opening and closing tags, like "<B>" and "</B>" for bold, 

27 "<I>" and "</I>" for italics, "<FONT . . . >" and "</FONT>" for font selection, 

28 "<CENTER>" and "</CENTER>" for horizontal centering, and "<TABLE>" and 

29 "</TABLE>" for a table. In fact, the entire HTML document 200 spans from the opening tag 

30 "<HTML>" to the closing tag "</HTML>." Not all opening tags have a closing tag. 
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1 Examples include "<BR>" for a line break, "<IMG>" for an image, "<HR>" for a horizontal 

2 rule or line, and "<-->" for comments. 
3 

4 Note that the HTML document 200 is illustrated in Figure 2 as having uppercase tags. 

5 One may choose to use lowercase tags instead. In fact, XHTML (extensible HTML) requires 

6 that tags be in lowercase. XHTML also requires that an opening tag for which there is no 

7 closing tag be in the form "<TAG/>." 

8 

9 Sections of the HTML document 200 are labeled with reference numbers having the 

10 same last two digits as reference numbers used to label corresponding parts of the web page 

11 1 00. For example, the HTML title section 220 gives the web page its title in the title bar 120. 

12 The HTML section 232 produces the top text section 132 ("Understanding Object Orientation 

13 Concepts") in an arial font, in an augmented size and in a particular color. The HTML 

14 section 234 produces the next line of text 134 in a bold format. The HTML section 236 

15 produces the next paragraph of text 136, including several italicized words. The HTML 

16 section 238 produces the text section 138. The HTML section 240 produces the graphic 

17 image 140, centered horizontally on the web page 100, by referencing a graphics file (.gif). 

18 The HTML statement 250 produces the horizontal line 150. The HTML table section 260 

19 produces the table 160, having three entries in a row. Each entry is a hyperlink 170, produced 

20 by the anchor statements 270 in the HTML document 200. The HTML document 200 may 

21 also contain comments (not shown) that do not appear in the browser window 110. 

22 

23 The conceptually simplest method for creating a markup language document is to type 

24 it manually using a text editor or word processor. However, manual preparation of 

25 documents is extremely labor intensive. Furthermore, manual preparation of documents is 

26 error prone. Even if additional labor is expended checking the document for errors or poor 

27 style, malformed documents can (and do) still result. Common errors are omission of 

28 required closing tags and having closing tags in the wrong order. Examples of poor style 

29 include not enclosing values in quotations (e.g., "arial," "+2" or "red" in the font tag in the 

30 HTML section 232), typing special characters directly rather than their escape code (e.g., 
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1 "&") in the HTML section 234 is the escape code for the ampersand character ("&"), and 

2 inadequate commenting. Errors and/or poor style in an HTML document can produce 

3 unpredictable results on different browsers. 
4 

5 Good practice demands that manually prepared HTML documents be validated. 

6 Although there exist HTML validation programs that can read an HTML file and report any 

7 errors or poor style, use of such validation programs requires extra time and effort. 

8 Furthermore, case by case validation processes are neither scalable nor extensible. As a 

9 result, it is difficult to generate a large number of consistent documents manually. 
10 

1 1 There are tools available to automate, to some degree, the generation of web pages. 

12 These tools are programs whose output is a markup language document. Examples of these 

13 tools are HTML editors and the automatic web page generator disclosed in U.S. Patent 

14 5,940,834. Page-based HTML editors typically present a browser view of a web page on 

15 which a user can enter and graphically manipulate items. Code-based HTML editors are 

16 essentially text editors enhanced with pull down menus, dialog boxes, shortcuts or other 

17 commands for entering tags in a quicker or more user friendly manner. Though simplifying 

18 HTML document creation for some authors, HTML editors fall short of providing complete 

19 automation and are not perfectly suited for high volume production. 
20 

21 The automatic web page generator disclosed in U.S. Patent 5,940,834 is a software 

22 program that presents a user with menus by which the user can add, delete or modify 

23 information about individuals in an organization (e.g., employees in a company). The output 

24 of the software program are HTML documents that produce a web-based personnel directory 

25 (e.g., employee telephone directory). Authoring software such as that software program (or 

26 an HTML editor) is a time-consuming endeavor that requires specialized skills and 

27 knowledge of markup languages. Like any good software, HTML generating programs 

28 ideally produce error-free and easy to read output, are extensible, scalable, robust, have 

29 intuitive appeal and are themselves easy to read. Such an ideal is difficult to achieve. 
30 
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1 SUMMARY OF THE INVENTION 

2 

3 The invention is computer readable media and methods associated with a software 

4 development tool that is useful when authoring programs that generate documents, such as 

5 markup language documents. 

6 

7 In one respect, the invention is a computer readable medium on which is embedded 

8 computer software. The software comprises a base class, an inline class and a container class. 

9 The base class defines a parent-child relationship by which a child object is stored within the 

I o storage space of its parent object The inline class is an extension of the base class, wherein a 

I I member of the inline class is permitted to be a child object but prohibited from being a parent 

12 object. The container class is an extension of the base class, wherein a member of the 

13 container class is permitted to be a child object and/or a parent object. A well-formed 

14 document can be modeled in software using members of the inline and/or container classes. 
15 

16 in another respect, the invention is a method of using the a set of classes to develop a 

17 document-producing program. The set of classes comprises the base class, inline class and 

18 container class. 
19 

20 In yet another respect, the invention is a document-producing program produced by 

21 the preceding method. 

22 

23 In comparison to known prior art, certain embodiments of the invention are capable of 

24 achieving certain advantages, including some or all of the following: ( 1 ) documents, such as 

25 markup language documents, can be programmatically generated, allowing improved 

26 development of automatic web page generators, for example; (2) the solution is scalable, 

27 reusable, extensible and flexible; and (3) the occurrences of malformed documents can be 

28 controlled or eliminated. Those skilled in the art will appreciate these and other advantages 

29 and benefits of various embodiments of the invention upon reading the following detailed 

30 description of a preferred embodiment with reference to the below-listed drawings. 
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1 

2 BRIEF DESCRIPTION OF THE DRAWINGS 

3 

4 Figure 1 illustrates an exemplary web page; 

5 Figure 2 illustrates an HTML document that produces the web page of Figure 1 ; 

6 Figure 3 illustrates a program, according to an embodiment of the invention, that 

7 produces as output the HTML document of Figure 2; 

8 Figure 4 is a class hierarchy diagram, according to an embodiment of the invention; 

9 Figure 5-9 illustrate pseudocode of various classes, according to an embodiment of the 

10 invention; 

1 1 Figure 1 0 is a table of function call usage parameters and their corresponding returned 

12 values, according to an embodiment of the invention; and 

13 Figure 11 is a flowchart of a method of using the software, according to an 

14 embodiment of the invention. 
15 

1 6 DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT 

17 

18 Figure 3 illustrates a program 300, according to an embodiment of the invention, that 

19 produces as output the HTML document 200. In other words, the program 300 is an HTML 

20 generator (also called a "web page generator"). The program 300 utilizes certain object 

2 1 oriented classes that are described in greater detail in Figures 4- 1 0. For now, the program 300 

22 will be described generally. The reader will better appreciate the program 300 in greater 

23 depth after reading the detailed descriptions of Figures 4-10. 



24 

25 The program 300 is a C++ main program. The program 300 includes a variable 

26 declaration section 310. The variable "document" is a data structure that ultimately contains 

27 the HTML formatted document, the output product of the program 300. Code sections of the 

28 program 300 are labeled with reference numbers having the same last two digits as reference 

29 numbers used to label corresponding sections of the HTML document and corresponding 

30 parts of the web page 100. The code section 332 adds the HTML section (producing the text 
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1 "Understanding Object Orientation Concepts") to the document. The code sections 334, 336 

2 and 338 illustrate alternative techniques for adding text-producing HTML to the document, 

3 particularly the HTML sections 234, 236 and 238. The code section 336 calls a function (not 

4 shown) to query a database. This function returns a data structure that includes the text 136 

5 and the graphics image 140. The code section 336 next calls another function to find the first 

6 occurrences of words beginning with "object," "class" and "instance" and to insert "<I> 

7 before and "</I>" after those instances. The code line 340 adds the HTML section 240, 

8 which produces the graphic image 140. The code line 350 adds the HTML tag 250 ("<HR>") 

9 that produces the horizontal line 150. The code section 360 adds the HTML section 260 that 

10 produces the table 160, which has three entries in one row. The code lines 370 add the 

1 1 HTML anchor commands 270 to the table, so as to produce the hyperlinks 170. 
12 

13 The program 300 is an intuitive way to programmatically generate the HTML 

14 document 200. The program 300, the HTML document 200 and the web page 100 are simple 

15 examples, but a programmer skilled in the art can readily appreciate that, in the same manner, 

1 6 very complex documents can be built using simple, easily understandable code. The power of 

17 this approach is due in part to the object classes on which the approach is based. Those object 

18 classes will now be described in greater detail, with reference to Figures 4-10. 
19 

20 Figure 4 depicts a class hierarchy 400, that illustrates the relationships among the 

21 various classes that are utilized, directly or indirectly, by the program 300. The classes exist 

22 in four layers, each layer depicted at a different indent in Figure 4. Together, these classes are 

23 referred to herein as "HTML builder classes." At the root or foundation layer is the single 

24 class htmlMarkup 410, This class serves as a specification and defines the basic internal 

25 relationships for the lower level classes. The htmlMarkup class 410 defines a rule for 

26 associating a child element with a parent container element. This rule is discussed in greater 

27 detail below. Under the htmlMarkup class 410 are two classes: htmllnlline 420 and 

28 htmlContainer 430, which are explained in detail below. Under these two classes are various 

29 extensions . For example, an anchor class 440 and an htmlText class 450 are extensions of the 

30 htmllnline class 420; a centered class 460, a table class 470 and an htmlDocument class 480 
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1 are extensions of the htmlContainer class 430. Optionally, higher abstraction classes are 

2 possible, such as a ModuleDispContainer class 490 and a ModuleTitleBaar class 495, both of 

3 which are extensions of the table class 470. 
4 

5 The htmlMarkup class 410 defines a parent-child relationship. Once this relationship 

6 is established between a child object and its parent object, the contents of the child data 

7 structure are moved from the child's storage into the parent's storage. Alternatively, the data 

8 could be copied, rather than moved, but that complicates memory management. Moving of 

9 the data is preferred, because doing so aids the developer (who uses the HTML builder 
10 classes) in proper memory management. The manner in which the data is stored is arbitrary. 
11 

12 The htmlContainer class 420 and the htmllnline class 430 are designed to model two 

1 3 basic types of HTML elements, each being optimized differently for their inherently different 

14 uses. The htmlContainer class 420 is permitted to be a parent, but the htmllnline class 430 is 

15 not. The htmlContainer class, therefore, can model HTML elements in which other HTML 

16 elements can be nested. Such HTML elements are herein termed "container" elements. The 

1 7 htmllnline class 430, on the other hand, best models HTML elements in which other HTML 

18 elements cannot be or typically are not nested. Such HTML elements are herein termed 

19 "inline" elements. Container elements always have opening and closing tags, whereas inline 

20 elements sometimes have only an opening tag. As a quick example, the entire HTML 

2 1 document 200 is an "HTML" container element (spanning from the opening tag "<HTML>" 

22 to the closing tag "</HTML>") containing many other elements. 
23 

24 Because a container element can contain any number of child elements, the 

25 htmlContainer class 420 preferably utilizes temporary files to store its information. 

26 Temporary storage is provided because it is often advantageous to allow the document 

27 generation program to construct parts of a document in tandem separately and then associate 

28 the parts as children to parents in a later step. Performance is improved by limiting the use of 

29 temporary files until needed and by limiting the amount of virtual memory required to house a 

30 document. Quite differently, an inline element cannot contain anything; therefore, storage for 
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1 the htmllnline class 430 is fixed to the class itself and preferably provided in virtual memory, 

2 which can be accessed more quickly. Performance if further improved by allowing child 

3 objects of either class to use their parent's storage if and when it becomes available. 

4 

5 Figures 5-9 illustrate pseudocode of various classes, according to an embodiment of 

6 the invention. Figure 5 illustrates pseudocode of the htmlMarkup class 410. The 

7 htmlMarkup class 410 includes a setParent function 510, that that is used to establish a 

8 relationship between a child object and its parent. Figure 6 illustrates pseudocode of the 

9 htmllnline class 420 and includes a destructor 610 that outputs a data structure "buffer," 

10 which is the basic data structure resident in virtual memory for inline HTML elements. 

1 1 Figure 7 illustrates pseudocode of the htmlAnchor class 440, which, in one embodiment, is a 

12 child of the htmllnline class 420. The htmlAnchor class 440 is invoked by the code lines 370 

13 of the program 300. As can be seen, a constructor function ("htmlAnchor") 710 constructs an 

14 HTML anchor tag ("<A .,.>") using the parameters passed in to the function. As an 

15 extension to the htmllnline class 420, the htmlAnchor class 440 utilizes the "buffer" data 

16 structure. 
17 

18 As an alternative, the htmlAnchor class 440 could be implemented as a container 

19 class. However, an anchor element only contains a very restricted number of element types, 

20 primarily text and images. By making the htmlAnchor class 440 inline, the builder classes are 

21 simpler in their programmatic interfaces and easier for developers to use and use correctly. 

22 

23 Figure 8 illustrates pseudocode of the htmlContainer class 430. The htmlContainer 

24 class 430 includes a destructor 810 that concatenates a child and parent. Thus, the parent 

25 "contains" the child. Figure 9 illustrates pseudocode of the htmlTable class 450. As can be 

26 seen, the htmlTable class 450 includes a constructor 910 that outputs a table start tag 

27 ("<TABLE>"). The htmlTable class 450 also includes a destructor 920 that outputs a table 

28 end tag ("</TABLE>"). The htmlTable class 450 includes an addRow function 930 that 

29 outputs the tag "<TR>"and an addContent function 940 that outputs content for one table cell 
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1 (column previously added row), as determined by input parameters, sandwiched between the 

2 tags "<TD>" and "</TD> " 

3 

4 The pseudocode in Figures 7 and 9 are representative of how the htmllnline class 420 

5 and the htmlContainer class 430, respectively, can be extended to model HTML elements. 

6 One skilled in the art of object oriented programming can appreciate how other HTML 

7 elements could similarly be modeled. A table 1000 listing other extension classes for various 

8 other HTML elements is shown in Figure 10. Generally, each class provides a constructor 

9 that defines the HTML element and allows the caller to set any of its attributes. For example, 

10 the htmllmage class allows the caller to set the attributes src (source filename), alt, border, 

1 1 width, height, id, class and style. A constructor of the inline type generally outputs to a data 

1 2 structure "buffer" the required opening tag, any attribute settings and closing tag, if necessary. 

13 With the container type, a constructor generally outputs to a temporary file just the opening 

14 tag and a destructor generally outputs the closing tag to the temporary file. The table 1000 is 

1 5 not exhaustive. All of the possible attributes for all possible HTML elements can be found in 

16 references common to those skilled in HTML. 
17 

18 The extension classes listed in the table 1000, and similar ones not shown, can 

19 themselves be extended to provide more specific types of HTML elements. For example, a 

20 particular style of table can be modeled as an extension of the htmlTable class 470. In this 

2 1 way, the first three layers of HTML builder classes provide extensible way to build even more 

22 powerful document building tools. In essence, the advantages of object orientation (e.g., 

23 modularity, reusability, testability, supportability) are brought to bear on document building. . 
24 

25 Figure 11 is a flowchart of a method 1100 of using the HTML builder classes, 

26 according to an embodiment of the invention. The first three steps of the method 1100 

27 involve coding a source code program. First, a programmer declares (1110) one or more 

28 variables of the types defined by the HTML builder classes and/or extensions thereof. Next, 

29 the programmer calls (1120) functions defined by the HTML builder classes and/or 

30 extensions thereof. Finally, the programmer codes (1 130) other parts of the program source 
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1 code. The variable declaration step 1110, the function calling step 1 120 and the other coding 

2 step 1 130 may be performed in any order, according to the preference of the programmer. 

3 The resulting source code program may be in any object oriented language, such as C++, for 

4 example. After coding the source code program, the programmer links (1 140) the source 

5 code to a library in which is packaged source code of the HTML builder classes. Next, the 

6 linked code is compiled (1 150) to produce an executable program. The program that is the 

7 product of the coding steps 1 1 10-1 130 is a document-producing program in source code form. 

8 The program that is the product of the compilation step 1150 is a document-producing 

9 program in executable form. Execution ( 1 1 60) of the document-producing program generates 

10 an HTML document. In fact, execution (1 160) of the document-producing program under 

1 1 different input conditions results in various HTML documents. Finally, a web browser can be 

12 utilized (1 170) to view the web page produced by the HTML document. 
13 

14 The HTML builder classes, their extensions, and a document-producing program that 

1 5 utilizes the HTML builder classes and/or their extensions can exist in a variety of forms both 

16 active and inactive. For example, they can exist as software program(s) comprised of 

17 program instructions in source code, object code, executable code or other formats. Any of 

18 the above can be embodied on a computer readable medium, which include storage devices 

19 and signals, in compressed or uncompressed form. Exemplary computer readable storage 

20 devices include conventional computer system RAM (random access memory), ROM (read 

21 only memory), EPROM (erasable, programmable ROM), EEPROM (electrically erasable, 

22 programmable ROM), and magnetic or optical disks or tapes. Exemplary computer readable 

23 signals, whether modulated using a carrier or not, are signals that a computer system hosting 

24 or running the computer program can be configured to access, including signals downloaded 

25 through the Internet or other networks. Concrete examples of the foregoing include 

26 distribution of the HTML builder classes, their extensions or document-producing programs 

27 on a CD ROM or via Internet download. In a sense, the Internet itself, as an abstract entity, is 

28 a computer readable medium. The same is true of computer networks in general. 

29 
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1 Although an embodiment of the invention has been described above with reference to 

2 HTML, the invention is applicable to documents of other markup languages. For example, 

3 the HTML document 200 could just as easily be an XHTML, XML (extensible markup 

4 language) or SGML (standardized generalized markup language) document. Even more 

5 generally, the document could be in any format, such as, for example, a word processing file. 

6 For example, a WORDPERFECT (TM) file has "codes" that are similar to tags in that some 

7 codes come in opening-closing pairs that can contain other codes or text. The invention 

8 contemplates all such documents. 
9 

10 What has been described and illustrated herein is a preferred embodiment of the 

1 1 invention along with some of its variations. The terms, descriptions and figures used herein 

1 2 are set forth by way of illustration only and are not meant as limitations. Those skilled in the 

13 art will recognize that many variations are possible within the spirit and scope of the 

14 invention, which is intended to be defined by the following claims - and their equivalents - 

15 in which all terms are meant in their broadest reasonable sense unless otherwise indicated. 
16 
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